Optimization and Scheduling of Applications in a Heterogeneous CPU-GPU Environment

Authors

  • Karan Rajendra Shetti
  • Suhaib A. Fahmy
  • Timo Bretschneider
Abstract

With the emergence of general-purpose computation on GPUs (GPGPU) and the corresponding programming frameworks (OpenCL, CUDA), more applications are being ported to use GPUs as co-processors to achieve performance that could not be attained with traditional processors alone. However, programming GPUs is not a trivial task, and the result depends on the experience and knowledge of the individual programmer. The main problem is identifying which task or job should be allocated to which device. The problem is further complicated by the dissimilar computational power of the CPU and the GPU. Therefore, there is a genuine need to optimize the workload balance. This thesis presents the work done toward the author’s postgraduate study and describes the optimization of the Heterogeneous Earliest Finish Time (HEFT) algorithm in the CPU-GPU heterogeneous environment. In the initial chapters, the available scheduling principles are described and an in-depth analysis of three state-of-the-art algorithms for the chosen heterogeneous environment is presented. Fine-grained and coarse-grained scheduling paradigms are also compared. Using the state-of-the-art StarPU scheduling framework and exhaustive benchmarks, it is shown that the fine-grained approach is much more efficient for the CPU-GPU environment. A novel optimization of the HEFT algorithm that takes advantage of the dissimilar execution times of the processors is proposed. By balancing the locally optimal result with the globally optimal result, it is shown that performance can be improved significantly without any change in the complexity of the algorithm (as compared to HEFT). The resulting algorithm, HEFT-NC (No-Cross), is compared with HEFT in terms of both speedup and schedule length. It is shown that HEFT-NC outperforms the HEFT algorithm and is consistent across different graph shapes and task sizes.
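For readers unfamiliar with the baseline that the thesis optimizes, the following is a minimal sketch of standard HEFT list scheduling, not of HEFT-NC, whose details the abstract does not give. The function name `heft`, the input layout (`succ`, `cost`, `comm`), and the two-processor example are all assumptions made for illustration. The sketch also simplifies HEFT by appending each task to the end of a processor's timeline instead of searching for idle gaps, and it assumes strictly positive execution times so that ordering by upward rank respects task dependencies.

```python
from statistics import mean


def heft(succ, cost, comm):
    """Simplified HEFT (no idle-gap insertion); all names are hypothetical.

    succ: task -> list of successor tasks (the graph must be a DAG)
    cost: task -> list of execution times, one entry per processor
    comm: (parent, child) -> transfer time, paid only when the two
          tasks are placed on different processors
    Returns task -> (processor, start, finish).
    """
    rank = {}

    def upward_rank(t):
        # Mean execution time plus the longest remaining path to an exit task.
        if t not in rank:
            rank[t] = mean(cost[t]) + max(
                (comm.get((t, s), 0) + upward_rank(s) for s in succ.get(t, [])),
                default=0,
            )
        return rank[t]

    # Invert the successor map to get each task's predecessors.
    pred = {t: [] for t in cost}
    for t, children in succ.items():
        for s in children:
            pred[s].append(t)

    n_proc = len(next(iter(cost.values())))
    free_at = [0] * n_proc  # time at which each processor becomes idle
    schedule = {}
    # Highest upward rank first; with positive costs, parents precede children.
    for t in sorted(cost, key=upward_rank, reverse=True):
        best = None
        for p in range(n_proc):
            # Inputs are ready once every parent has finished and, if the
            # parent ran on another processor, its data has been transferred.
            ready = max(
                (
                    schedule[u][2]
                    + (comm.get((u, t), 0) if schedule[u][0] != p else 0)
                    for u in pred[t]
                ),
                default=0,
            )
            start = max(ready, free_at[p])
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)  # keep the earliest finish time
        schedule[t] = best
        free_at[best[0]] = best[2]
    return schedule


# Hypothetical diamond-shaped task graph; processor 0 plays the CPU and
# processor 1 the GPU, with deliberately dissimilar execution times.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cost = {"A": [2, 4], "B": [6, 1], "C": [3, 3], "D": [4, 2]}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
schedule = heft(succ, cost, comm)
makespan = max(finish for _, _, finish in schedule.values())
```

Each task is placed greedily on the processor that finishes it earliest, so the choice is locally optimal per task; the schedule length (`makespan`) is the global quantity that the abstract says the proposed optimization balances against those local choices.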

Similar resources

Improving Resource Utilization in Heterogeneous CPU-GPU Systems

Graphics processing units (GPUs) have attracted enormous interest over the past decade due to substantial increases in both performance and programmability. Programmers can potentially leverage GPUs for substantial performance gains, but at the cost of significant software engineering effort. In practice, most GPU applications do not effectively utilize all of the available resources in a syste...

Implementation of direction-of-arrival estimation algorithms by means of GPU-parallel processing in the CUDA environment (Research Article)

Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC), are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...

A Study of Scheduling a Neuro-imaging Application On a Heterogeneous CPU-GPU Cluster by Reza Nakhjavani

A Study of Scheduling a Neuro-imaging Application On a Heterogeneous CPU-GPU Cluster. Reza Nakhjavani, Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2014. The ever-increasing complexity of scientific applications has led to utilization of new HPC paradigms such as Graphical Processing Units (GPUs). However, modifying applications to run ...

Intelligent Scheduling for Simultaneous CPU-GPU Applications

Heterogeneous computing systems with both general-purpose multicore central processing units (CPUs) and specialized accelerators have emerged recently. The graphics processing unit (GPU) is the most widely used accelerator. To utilize such a heterogeneous system’s full computing power, coordination between the two distinct devices, CPU and GPU, is necessary. Previous research has addressed this...

متن کامل

Scalability and Parallel Execution of OmpSs-OpenCL Tasks on Heterogeneous CPU-GPU Environment

With heterogeneous computing becoming mainstream, researchers and software vendors have been trying to exploit the best of the underlying architectures, such as GPUs and CPUs, to enhance performance. Parallel programming models play a crucial role in achieving this enhancement. One such model is OpenCL, a parallel computing API for cross-platform computations targeting heterogeneous architectures. Ho...


Publication date: 2014